
    Roc632: An overview

    This paper analyzes the ROC632 package, describing its main characteristics and functions, and evaluates its effectiveness, strengths, and weaknesses. The package was created to overcome the lack of information concerning incomplete time-to-event data by adapting the 0.632+ bootstrap estimator to the evaluation of time-dependent ROC curves. Applying the package to a specific dataset (DLBCLpatients) makes it possible to assess whether it analyzes complete and incomplete data efficiently and without bias.
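
For orientation, the 0.632+ bootstrap estimator is usually written as below for a generic prediction error. This is the textbook form, not taken from the abstract, which does not describe how ROC632 adapts it to time-dependent ROC curves. The symbols are assumptions introduced here: $\overline{\mathrm{err}}$ is the apparent (resubstitution) error, $\widehat{\mathrm{Err}}^{(1)}$ the leave-one-out bootstrap error, and $\hat{\gamma}$ the no-information error rate.

```latex
\widehat{\mathrm{Err}}^{(.632+)}
  = (1 - \hat{w})\,\overline{\mathrm{err}} + \hat{w}\,\widehat{\mathrm{Err}}^{(1)},
\qquad
\hat{w} = \frac{0.632}{1 - 0.368\,\hat{R}},
\qquad
\hat{R} = \frac{\widehat{\mathrm{Err}}^{(1)} - \overline{\mathrm{err}}}
               {\hat{\gamma} - \overline{\mathrm{err}}}
```

When $\hat{R} = 0$ (no overfitting) the weight reduces to 0.632, which recovers the plain 0.632 estimator.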

    A Dynamic Noise Level Algorithm for Spectral Screening of Peptide MS/MS Spectra

    BACKGROUND: High-throughput shotgun proteomics data contain a significant number of spectra from non-peptide ions or spectra of too poor quality to obtain highly confident peptide identifications. These spectra cannot be identified with any positive peptide matches in some database search programs or are identified with false positives in others. Removing these spectra can improve the database search results and lower computational expense. RESULTS: A new algorithm has been developed to filter tandem mass spectra of poor quality from shotgun proteomic experiments. The algorithm determines the noise level dynamically and independently for each spectrum in a tandem mass spectrometric data set. Spectra are filtered based on a minimum number of required signal peaks with a signal-to-noise ratio of 2. The algorithm was tested with 23 sample data sets containing 62,117 total spectra. CONCLUSIONS: The spectral screening removed 89.0% of the tandem mass spectra that did not yield a peptide match when searched with the MassMatrix database search software. Only 6.0% of tandem mass spectra that yielded peptide matches considered to be true positive matches were lost after spectral screening. The algorithm was found to be very effective at removal of unidentified spectra in other database search programs including Mascot, OMSSA, and X!Tandem (75.93%-91.00%) with a small loss (3.59%-9.40%) of true positive matches.
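
The filtering rule described above (a per-spectrum noise level, then a minimum count of peaks with a signal-to-noise ratio of at least 2) can be sketched roughly as follows. This is only an illustration: the abstract does not say how the published algorithm estimates the noise level, so the median peak intensity is used here as a stand-in, and `passes_screen` and `min_signal_peaks` are hypothetical names and values.

```python
from statistics import median


def passes_screen(intensities, min_signal_peaks=5, snr_threshold=2.0):
    """Return True if a spectrum has enough peaks above the S/N threshold.

    ASSUMPTION: the noise level is the median peak intensity of this one
    spectrum. The published algorithm's noise estimate is not given in the
    abstract; min_signal_peaks=5 is likewise an illustrative value.
    """
    if not intensities:
        return False
    noise = median(intensities)  # recomputed independently for each spectrum
    if noise <= 0:
        return False
    signal_peaks = sum(1 for i in intensities if i / noise >= snr_threshold)
    return signal_peaks >= min_signal_peaks


flat_spectrum = [10.0] * 20                  # nothing rises above the noise
rich_spectrum = [10.0] * 20 + [50.0] * 6     # six peaks at S/N = 5
kept = [s for s in (flat_spectrum, rich_spectrum) if passes_screen(s)]
```

Because the noise estimate is derived from each spectrum on its own, a weak but clean spectrum and an intense but noisy one are judged on the same relative scale.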

    ProtQuant: a tool for the label-free quantification of MudPIT proteomics data

    BACKGROUND: Effective and economical methods for quantitative analysis of high-throughput mass spectrometry data are essential to meet the goals of directly identifying, characterizing, and quantifying proteins from a particular cell state. Multidimensional Protein Identification Technology (MudPIT) is a common approach used in protein identification. Two types of methods are used to detect differential protein expression in MudPIT experiments: those involving stable isotope labelling and the so-called label-free methods. Label-free methods are based on the relationship between protein abundance and sampling statistics such as peptide count, spectral count, probabilistic peptide identification scores, and sum of peptide Sequest XCorr scores (ΣXCorr). Although a number of label-free methods for protein quantification have been described in the literature, there are few publicly available tools that implement these methods. We describe ProtQuant, a Java-based tool for label-free protein quantification that uses the previously published ΣXCorr method for quantification and includes an improved method for handling missing data. RESULTS: ProtQuant was designed for ease of use and portability for the bench scientist. It implements the ΣXCorr method for label-free protein quantification from MudPIT datasets. ProtQuant has a graphical user interface, accepts multiple file formats, is not limited by the size of the input files, and can process any number of replicates and any number of treatments. In addition, ProtQuant implements a new method for dealing with missing values for peptide scores used for quantification. The new algorithm, called ΣXCorr*, uses "below threshold" peptide scores to provide meaningful non-zero values for missing data points. We demonstrate that ΣXCorr* produces an average reduction in false positive identifications of differential expression of 25% compared to ΣXCorr. CONCLUSION: ProtQuant is a tool for protein quantification built for multi-platform use with an intuitive user interface. ProtQuant efficiently and uniquely performs label-free quantification of protein datasets produced with Sequest and provides the user with facilities for data management and analysis. Importantly, ProtQuant is available as a self-installing executable for the Windows environment used by many bench scientists.
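
The difference between summing only above-threshold XCorr scores and falling back on "below threshold" scores for otherwise missing proteins can be sketched as below. This is a simplified reading of the abstract, not ProtQuant's actual rule: the function name, the threshold of 2.0, and the fallback logic are all assumptions made for illustration.

```python
from collections import defaultdict


def sigma_xcorr(peptide_hits, threshold=2.0, use_below_threshold=False):
    """Sum peptide XCorr scores per protein.

    peptide_hits: iterable of (protein_id, xcorr) pairs.
    ASSUMPTION: a protein with no above-threshold peptide counts as a
    missing value. With use_below_threshold=True its below-threshold scores
    are summed instead of reporting zero; the real ProtQuant rule may differ.
    """
    above = defaultdict(float)
    below = defaultdict(float)
    for protein, xcorr in peptide_hits:
        if xcorr >= threshold:
            above[protein] += xcorr
        else:
            below[protein] += xcorr

    result = {}
    for protein in set(above) | set(below):
        if protein in above:
            result[protein] = above[protein]
        elif use_below_threshold:
            result[protein] = below[protein]   # non-zero stand-in value
        else:
            result[protein] = 0.0              # plain missing value
    return result


# Illustrative, made-up scores: P2 has only below-threshold peptides.
hits = [("P1", 3.0), ("P1", 2.5), ("P2", 1.2), ("P2", 0.8)]
plain = sigma_xcorr(hits)
starred = sigma_xcorr(hits, use_below_threshold=True)
```

With the fallback enabled, a protein that is present but weakly sampled is no longer compared as an exact zero against another treatment.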

    A mass accuracy sensitive probability based scoring algorithm for database searching of tandem mass spectrometry data

    BACKGROUND: Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) has become one of the most used tools in mass spectrometry based proteomics. Various algorithms have since been developed to automate the process for modern high-throughput LC-MS/MS experiments. RESULTS: A probability based statistical scoring model for assessing peptide and protein matches in tandem MS database search was derived. The statistical scores in the model represent the probability that a peptide match is a random occurrence based on the number or the total abundance of matched product ions in the experimental spectrum. The model also calculates probability based scores to assess protein matches. Thus the protein scores in the model reflect the significance of protein matches and can be used to differentiate true from random protein matches. CONCLUSION: The model is sensitive to high mass accuracy and implicitly takes mass accuracy into account during scoring. High mass accuracy not only reduces false positives, but also improves the scores of true positive matches. The algorithm is incorporated in an automated database search program, MassMatrix.
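
One common way to express "the probability that a match is a random occurrence, given the number of matched product ions" is a binomial tail probability. The sketch below uses that model purely as an illustration; the abstract does not give the actual formula used in MassMatrix, and `p_random` (the chance that a single theoretical ion matches a peak by accident) is a hypothetical parameter.

```python
from math import comb, log10


def random_match_probability(n_theoretical, n_matched, p_random):
    """P(at least n_matched of n_theoretical ions match purely by chance).

    ASSUMPTION: ions match independently with probability p_random
    (binomial model). This is a stand-in, not the published scoring model.
    """
    if not 0.0 < p_random < 1.0:
        raise ValueError("p_random must lie strictly between 0 and 1")
    return sum(
        comb(n_theoretical, k) * p_random**k * (1 - p_random) ** (n_theoretical - k)
        for k in range(n_matched, n_theoretical + 1)
    )


def match_score(n_theoretical, n_matched, p_random):
    """Higher score = less likely to be a random match."""
    return -log10(random_match_probability(n_theoretical, n_matched, p_random))


# A tighter mass tolerance lowers the chance of an accidental ion match,
# so the same 8-of-20 match scores higher.
loose = match_score(20, 8, p_random=0.10)
tight = match_score(20, 8, p_random=0.01)
```

In this toy model, mass accuracy enters only through `p_random`, which is one way a score can take mass accuracy into account implicitly.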

    Proteomic analysis of the Plasmodium male gamete reveals the key role for glycolysis in flagellar motility.

    BACKGROUND: Gametogenesis and fertilization play crucial roles in malaria transmission. While male gametes are thought to be amongst the simplest eukaryotic cells and are proven targets of transmission blocking immunity, little is known about their molecular organization. For example, the pathway of energy metabolism that powers motility, a feature that facilitates gamete encounter and fertilization, is unknown. METHODS: Plasmodium berghei microgametes were purified and analysed by whole-cell proteomic analysis for the first time. Data are available via ProteomeXchange with identifier PXD001163. RESULTS: 615 proteins were recovered, including all male gamete proteins described thus far. Amongst them were the 11 enzymes of the glycolytic pathway. The hexose transporter was localized to the gamete plasma membrane, and it was shown that microgamete motility can be suppressed effectively by inhibitors of this transporter and of the glycolytic pathway. CONCLUSIONS: This study describes the first whole-cell proteomic analysis of the malaria male gamete. It identifies glycolysis as the likely exclusive source of energy for flagellar beat, and provides new insights into original features of Plasmodium flagellar organization.

    ETISEQ – an algorithm for automated elution time ion sequencing of concurrently fragmented peptides for mass spectrometry-based proteomics

    BACKGROUND: Concurrent peptide fragmentation (i.e. shotgun CID, parallel CID or MS^E) has emerged as an alternative to data-dependent acquisition in generating peptide fragmentation data in LC-MS/MS proteomics experiments. Concurrent peptide fragmentation data acquisition has been shown to be advantageous over data-dependent acquisition by providing greater detection dynamic range and more accurate quantitative information. Nevertheless, concurrent peptide fragmentation data acquisition has yet to be widely adopted due to the lack of published algorithms designed specifically to process or interpret such data acquired on any mass spectrometer. RESULTS: An algorithm called Elution Time Ion Sequencing (ETISEQ) has been developed to enable automated conversion of data from concurrent peptide fragmentation acquisition to LC-MS/MS data. ETISEQ generates MS/MS-like spectra based on the correlation of precursor and product ion elution profiles. The performance of ETISEQ is demonstrated using concurrent peptide fragmentation data from tryptic digests of standard proteins and whole influenza virus. It is shown that the number of unique peptides identified from the digests is broadly comparable between ETISEQ processed concurrent peptide fragmentation data and the data-dependent acquired LC-MS/MS data. CONCLUSION: The ETISEQ algorithm has been designed for easy integration with existing MS/MS analysis platforms. It is anticipated that it will popularize concurrent peptide fragmentation data acquisition in proteomics laboratories.
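
The idea of building MS/MS-like spectra from correlated elution profiles can be illustrated with a small sketch: product ions whose intensity-over-time trace rises and falls with a precursor's trace are grouped with that precursor. The Pearson correlation, the 0.9 cutoff, and the function names are assumptions chosen for this example; the abstract does not specify ETISEQ's correlation measure or thresholds.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity traces."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat trace carries no co-elution information
    return cov / sqrt(var_x * var_y)


def group_products(precursor_profile, product_profiles, min_corr=0.9):
    """Return product-ion m/z values that co-elute with the precursor.

    ASSUMPTION: min_corr=0.9 is an illustrative cutoff, not ETISEQ's.
    product_profiles maps product m/z -> intensity trace over the same scans.
    """
    return sorted(
        mz
        for mz, profile in product_profiles.items()
        if pearson(precursor_profile, profile) >= min_corr
    )


# Made-up traces over seven scans.
precursor = [0, 1, 4, 9, 4, 1, 0]
products = {
    500.3: [0, 2, 8, 18, 8, 2, 0],   # same peak shape, co-eluting
    610.4: [5, 4, 1, 0, 1, 4, 5],    # dips where the precursor peaks
}
pseudo_msms = group_products(precursor, products)
```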

    Identification of alternative splice variants in Aspergillus flavus through comparison of multiple tandem MS search algorithms

    BACKGROUND: Database searching is the most frequently used approach for automated peptide assignment and protein inference of tandem mass spectra. The results, however, depend on the sequences in target databases and on search algorithms. Recently, by using an alternative splicing database, we identified more proteins than with the annotated proteins in Aspergillus flavus. In this study, we aimed at finding a greater number of eligible splice variants based on newly available transcript sequences and the latest genome annotation. The improved database was then used to compare four search algorithms: Mascot, OMSSA, X! Tandem, and InsPecT. RESULTS: The updated alternative splicing database predicted 15833 putative protein variants, 61% more than the previous results. There was transcript evidence for 50% of the updated genes compared to the previous 35% coverage. Database searches were conducted using the same set of spectral data, search parameters, and protein database but with different algorithms. The false discovery rates of the peptide-spectrum matches were estimated at < 2%. The numbers of the total identified proteins varied from 765 to 867 between algorithms. Whereas 42% (1651/3891) of peptide assignments were unanimous, the comparison showed that 51% (568/1114) of the RefSeq proteins and 15% (11/72) of the putative splice variants were inferred by all algorithms. Twelve plausible isoforms were discovered by focusing on the consensus peptides which were detected by at least three different algorithms. The analysis found different conserved domains in two putative isoforms of UDP-galactose 4-epimerase. CONCLUSIONS: We were able to detect dozens of new peptides using the improved alternative splicing database with the recently updated annotation of the A. flavus genome. Unlike the identifications of the peptides and the RefSeq proteins, large variations existed between the putative splice variants identified by different algorithms. Twelve candidates of putative isoforms were reported based on the consensus peptide-spectrum matches. This suggests that applications of multiple search engines effectively reduced the possible false positive results and validated the protein identifications from tandem mass spectra using an alternative splicing database.
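
The consensus step, keeping peptides reported by at least three of the search algorithms, amounts to a simple vote count. The sketch below shows that bookkeeping only; the function name and the peptide sequences are made up for illustration and are not data from the study.

```python
from collections import Counter


def consensus_peptides(results_by_engine, min_engines=3):
    """Return peptides reported by at least min_engines search engines.

    results_by_engine maps an engine name to the peptides it identified.
    Duplicates within one engine are counted once.
    """
    votes = Counter()
    for peptides in results_by_engine.values():
        votes.update(set(peptides))
    return {peptide for peptide, n in votes.items() if n >= min_engines}


# Hypothetical peptide sequences, for illustration only.
results = {
    "Mascot":   ["AAAK", "CCCR", "DDDK"],
    "OMSSA":    ["AAAK", "CCCR"],
    "X!Tandem": ["AAAK", "DDDK", "DDDK"],
    "InsPecT":  ["AAAK", "CCCR", "EEEK"],
}
consensus = consensus_peptides(results)
unanimous = consensus_peptides(results, min_engines=len(results))
```

Raising `min_engines` to the number of engines gives the unanimous set, the stricter criterion the abstract reports separately.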

    Harvest: an open-source tool for the validation and improvement of peptide identification metrics and fragmentation exploration

    BACKGROUND: Protein identification using mass spectrometry is an important tool in many areas of the life sciences, and in proteomics research in particular. Increasing the number of proteins correctly identified depends on the ability to include new knowledge about the mass spectrometry fragmentation process in computational algorithms designed to separate true matches of peptides to unidentified mass spectra from spurious matches. This discrimination is achieved by computing a function of the various features of the potential match between the observed and theoretical spectra to give a numerical approximation of their similarity. It is these underlying "metrics" that determine the ability of a protein identification package to maximise correct identifications while limiting false discovery rates. There is currently no software available specifically for the simple implementation and analysis of arbitrary novel metrics for peptide matching and for the exploration of fragmentation patterns for a given dataset. RESULTS: We present Harvest: an open source software tool for analysing fragmentation patterns and assessing the power of a new piece of information about the MS/MS fragmentation process to more clearly differentiate between correct and random peptide assignments. We demonstrate this functionality using data metrics derived from the properties of individual datasets in a peptide identification context. Using Harvest, we demonstrate how the development of such metrics may improve correct peptide assignment confidence in the context of a high-throughput proteomics experiment and characterise properties of peptide fragmentation. CONCLUSIONS: Harvest provides a simple framework in C++ for analysing and prototyping metrics for peptide matching, the core of the protein identification problem. It is not a protein identification package and answers a different research question to packages such as Sequest, Mascot, and X!Tandem. It does not aim to maximise the number of assigned peptides from a set of unknown spectra, but instead provides a method by which researchers can explore fragmentation properties and assess the power of novel metrics for peptide matching in the context of a given experiment. Metrics developed using Harvest may then become candidates for later integration into protein identification packages.

    Disulphide Bridges of Phospholipase C of Chlamydomonas reinhardtii Modulates Lipid Interaction and Dimer Stability

    BACKGROUND: Phospholipase C (PLC) is an enzyme that plays a pivotal role in a number of signaling cascades. These are active in the plasma membrane and trigger cellular responses by catalyzing the hydrolysis of membrane phospholipids, thereby generating secondary messengers. Phosphatidylinositol-PLC (PI-PLC) specifically interacts with phosphoinositide and/or phosphoinositol and catalyzes specific cleavage of the sn-3 phosphodiester bond. Several isoforms of PLC are known to form and function as dimers, but very little is known about the molecular basis of the dimerization and its importance in the lipid interaction. PRINCIPAL FINDINGS: We report that disruption of a disulphide bond of a novel PI-specific PLC of C. reinhardtii (CrPLC) can modulate its interaction affinity with a set of phospholipids and also the stability of its dimer. CrPLC was found to form a mixture of higher oligomeric states with monomer and dimer as major species. The dimer adduct of CrPLC disappeared in the presence of DTT, which suggested the involvement of disulphide bond(s) in CrPLC oligomerization. Dimer-monomer equilibrium studies with the isolated fractions of CrPLC monomer and dimer supported the involvement of covalent forces in the dimerization of CrPLC. A disulphide bridge was found to be responsible for the dimerization, and Cys7 seems to be involved in the formation of the disulphide bond. This crucial disulphide bond also modulated the lipid affinity of CrPLC. Oligomers of CrPLC were also captured under in vivo conditions. CrPLC was found to be localized mainly in the plasma membrane of the cell. The cell surface localization of CrPLC may have significant implications for the downstream regulatory function of CrPLC. SIGNIFICANCE: This study helps in establishing the role of CrPLC (or similar proteins) in the quaternary structure of the molecule and its affinities during lipid interactions.